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Abstract — Calibration of a sensor array is more involved if the 
antennas have direction dependent gains and multiple calibrator 
sources are simultaneously present. We study this case for a 
sensor array with arbitrary geometry but identical elements, 
i.e. elements with the same direction dependent gain pattern. A 
weighted alternating least squares (WALS) algorithm is derived 
that iteratively solves for the direction independent complex gains 
of the array elements, their noise powers and their gains in the 
. . . direction of the calibrator sources. An extension of the problem 
' is the case where the apparent calibrator source locations are 
T-H unknown, e.g., due to refractive propagation paths. For this 
case, the WALS method is supplemented with weighted subspace 
04 fitting (WSF) direction flnding techniques. Using Monte Carlo 
; I simulations we demonstrate that both methods are asymptotically 
, statistically efficient and converge within two iterations even in 
, cases of low SNR. 

Index Terms — array signal processing, radio astronomy, self- 
• calibration 



I. Introduction 

In this paper we study the calibration of the direction 
dependent response of sensor array antennas, excited by si- 
multaneously present calibrator sources. The antenna array 
has arbitrary geometry but identical antennas. The calibration 
involves the complex gain of the antenna elements towards 
each source. The source powers are known but we will allow 
for small deviations in apparent source locations to account 
for, e.g., refractive propagation paths. This problem is one 
of the main challenges currently faced in the field of radio 
astronomy. For low frequency observations (< 300 MHz) 
this community is building or developing a number of new 
instruments, for example the low frequency array (LOFAR) 
[1], the Murchison wide field array (MWA) [2] and the 
primeval structure telescope (PaST) [3], which are all large 
irregular phased arrays. These difficulties arise due to the 
influence of the ionosphere on the propagation of radio waves, 
which can qualitatively be categorized into the following four 
regimes depending on the field of view (FOV) of the individual 
receptors and the baselines between them [4]: 

1) All antennas and all lines of sight sample the same iono- 
spheric delays, thus the ionosphere causes no distortion 
of the array manifold (small FOV, short baselines). 

2) Lines of sight from different antennas sample different 
ionospheric patches, but all sources in the antenna FOV 
experience the same delay; the ionosphere thus causes an 
antenna based gain effect (small FOV, long baselines). 
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Fig. 1. Graphical representation of regime 3: all lines of sight towards a 
single source sample the same ionosphere, but the ionospheric delay differs 
per source due to the large FOV and short baselines (after [4]). 



3) All lines of sight towards a single source sample the 
same ionosphere but the delay differs per source. This 
regime requires source based direction dependent cali- 
bration (large FOV, short baselines, see Fig. [T]i. 

4) The ionospheric delays differ per station and per source 
(large FOV, long baselines). 

The first two regimes can be handled under the self- 
calibration assumption used in astronomy that all errors are 
telescope based thereby reducing the estimation problem to a 
single direction independent gain factor per telescope [5]. The 
problem of estimating these direction independent gains based 
on a single calibrator source is treated in [6], while the multiple 
source case has been discussed in [7]. The problem of finding 
a complex gain per antenna per source, the fourth regime, 
is not tractable without further assumptions which allow to 
parameterize the behavior of the gains over space, time and/or 
frequency [8]. In this paper we will focus on the third regime, 
which is sketched in Fig.[Tl thus filling the gap in the available 
literature. It allows for the calibration of individual closely 
packed groups of antennas such as a LOFAR station or a 
subarray of the MWA and PaST telescopes and thus forms 
a valuable step towards the calibration of the whole array. 

We will state our problem in general terms, since the 
problem plays a role in a range of applications varying from 
underwater acoustics to antenna arrays. This explains the 
persistent interest in on-line calibration, autocalibration or self- 
calibration of sensor arrays [8]-[12]. In most applications the 
main driver for studies on array calibration is to improve 
the DOA estimation accuracy. Many studies in this field 
therefore try to solve for the DOAs and a number of array 
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parameters. In [9], [13], [14] self-calibration schemes are 
presented which solve for the direction independent gains 
and the sensor positions, i.e. the directional response of the 
sensors is assumed known. Other studies assume a controlled 
environment to calibrate the array by measuring the calibrator 
sources one at a time [15], [16] or exploit the array geometry, 
e.g. a uniform linear array (ULA) having a Toeplitz matrix 
as array co variance matrix [10], [17]. Weiss and Friedlander 
[18] have presented a technique for almost fully blind signal 
estimation. Their work, however, focuses on estimation and 
separation of source signals, not on characterizing the array 
itself. We thus feel that the problem at hand also forms an 
interesting addition to the literature available on sensor array 
calibration in general. 

In the next section we introduce the data model and provide 
a mathematical formulation of the problem. In Sec. |III] the 
Cramer-Rao bound for this estimation problem is discussed. 
Section |IV] presents an alternating least squares (ALS) and 
a weighted alternating least squares (WALS) approach op- 
timizing subsets of parameters iteratively. These algorithms 
are validated using Monte Carlo simulations in Sec. |V] The 
conclusions are drawn in Sec. |Vll 

Notation: Overbar (•) denotes conjugation, the transpose 
operator is denoted by ^, the complex conjugate (Hermitian) 
transpose by ^ and the pseudo-inverse by ^ . An estimated 
value is denoted by £{■}. is the element-wise matrix 
multiplication (Hadamard product), is the element-wise 
matrix division, denotes the Kronecker product and o is 
used to denote the Khatri-Rao or column-wise Kronecker 
product of two matrices, diag(-) converts a vector to a diagonal 
matrix with the vector placed on the main diagonal, vecdiag(-) 
produces a vector from the elements of the main diagonal of its 
argument and vec(-) converts a matrix to a vector by stacking 
the columns of the matrix. We exploit many properties of 
Kronecker products, including the following (for matrices and 
vectors of compatible dimensions); 

vec(ABC) = (C^0A)vec(B) (1) 
vec(Adiag(b)C) = (C^ o A)b (2) 
(AoB)'f^(CoD) = A^C0B^D (3) 

II. Data model 

We consider an array of p elements located at a single site 
("Regime 3"). Denote the complex baseband output signal of 
the ith array element by Xiit) and define the aiTay signal vector 
x(t) = [xi{t), X2{t), ■ ■ ■ , Xp{t)]'^ . We assume the presence of 
q mutually independent i.i.d. Gaussian signals Sk{t) impinging 
on the array, which are stacked in a </ x 1 vector s(t). Likewise 
the sensor noise signals ni{t) are assumed to be mutually 
independent i.i.d. Gaussian signals and are stacked in a p x 1 
vector n(<). If the narrow band condition holds [19], we can 
define the q spatial signature vectors (pxl), which describe 
for the kth source the phase delays at each antenna due only 
to the geometry. 

The q sources are considered calibrator sources, thus we 
assume that their powers, their nominal positions, and the 
locations of the antennas (hence a^) are known. Refractive 



effects caused by ionospheric phase gradients may shift the 
apparent locations of the sources, thus we will also consider 
cases where the a^ are only parametrically known. 

The sensors are assumed to have the same direction depen- 
dent gain behavior towards the q source signals received by 
the array. This can include the antenna pattern and ionospheric 
phase effects. They are described by gain factors gofe and 
are collected in a matrix Gq = diag([goi, .9o2, ■ ■ • ,9oq])- 
The direction independent gains and phases (individual re- 
ceiver gains) can be described as 7 = [71,72, ■■ ■ >7p]"^ and 
cj) = [d'^ijcj'^^, • • • jcj'^p]-^ respectively with corresponding 
diagonal matrix forms F — diag(7) and 4> = diag(^). With 
these definitions, the array signal vector can be described as 

x(i) = r* SLkgokSkit)^ + n(t) = GAGos(f) + n{t) 

(4) 

where A = [ai, • • • , a.k] (size p x q) and G = r$; for later 
use we also define g = 7 ^. 

The signal is sampled with period T and N sam- 
ple vectors are stacked into a data matrix X = 
[x(T),x(2r), • • • ,x(iVr)]. The covariance matrix of x(t) is 
R = £{x{t)x"{t)} and is estimated by R = A^-^XX^. 
Likewise, the source signal covariance Sg = diag((Ts) where 
(7s = [Csij o's2: ' ' ' , '^sql'^ ^nd the noise covariance matrix is 
E„ = diag((T„) where cr„ = [(T^i,cr?j2> ' ' ' :'^lpV- Then the 
model for R based on (|4]i is 

R=GAGo5],,G^A^G^ + S„. (5) 

In this model, Ss is known. Since Gq and are diagonal 
matrices, we can introduce 

S = GqSsGo^ 

^ diag([|5oipfT,^, • • • , IffOglVsJ) = diag(cr) . (6) 

Since the direction dependent gains gofc are not known, the 
real valued elements of cr = [af, cr|, • • • , cr^]-^ are not known 
either This implies that our problem is identical to estimating 
an unknown diagonal signal covariance matrix without any 
DOA dependent errors. We may thus restate ^ as 

R = GAEA^G^ + S„ (7) 

and solve for G, S and 5]„ under the assumption that A is 
known (or parametrically known). 

The data model described by ^ is commonly used in papers 
on sensor array calibration (e.g. [6], [20]). Flanagan and Bell 
[9], Weiss and Friedlander [13] and See [14] effectively use 
the same model, but focus on position calibration of the array 
elements and are therefore more explicit on the form of A. 
If the source positions and locations of the sensors within the 
array are known, an explicit formula for a^ can be used to 
compute the nominal spatial signature vectors. Estimation of 
source locations is required to account for the refractive effects 
produced by an ionospheric phase gradient. 

The zth element of the array is located at = 
[xi, Hi, ZiY" . These positions can be stacked in a matrix TZ = 
[ri, r2, • • • , Vp]^ (size p x 3). The position of the fcth source 
can be denoted as 1^ = [lk,mk,nk\^ . The source positions 
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can be stacked in a matrix £ = [li, I2, • • • , Iq]'^ (size q x 3). 
The spatial signature matrix A can thus be described by 



, 27r ^ 
A = exp I —}—TZC 



(8) 



where the exponential function is applied element-wise to its 
argument. In the remainder of this paper we will speciaUze to 
a planar array having Zi — for convenience of presentation 
but without loss of generality. 

From (Q we observe that G and S share a common 
scalar factor. Therefore we may impose the boundary condition 
af = 1. To solve the ambiguity in the phase solution of G we 
will take the first element as phase reference, i.e. = 
is imposedQ When solving for source locations a similar 
problem occurs: a single rotation of all DOA vectors can be 
compensated by the direction independent gain phase solution. 
We will therefore fix the position of the first source. 

In this paper we will address four related sensor array 
calibration problems based on this data model. These are 
summarized below where the parameter vectors adhering to 
the aforementioned boundary conditions are stated explicitly. 

1) The sensor noise powers are the same for aU ele- 



j2 

nq 



ments, i.e. cr„j 

S„ = cr^I where I is the identity matrix. In this 
scenario the parameter vector to be estimated is 6 ^ 

2) The sensor noise powers are allowed to differ 
from one element to the other, i.e. S„ = 
diag((T„). In this case the parameter vector is 



X 2 2 2 



3) S„ = cr^I and A = A(£), i.e. similar to the first 
scenario but with unknown source locations. In this case 

e -- 

4) 



[7^,02, 



2 2 iT 



J J 



TlT 



[7^,02, 



diag(cr„) and A = A(£), giving 6 



,1 



TlT 



1 '■q J 



III. CRB Analysis 



The Cramer-Rao bound (CRB) on the error variance for any 
unbiased estimator states that [22] 

c = £\^{e-e){e-ey^>^3-\ (9) 

where C is the covariance matrix of 6, and C is the Fisher 
Information Matrix (FIM). For Gaussian signals the FIM can 
be expressed as 



J = Tr ^ ® R-M F 



(10) 



where F is the Jacobian evaluated at the true values of the 
parameters, i.e. 

^ ^vec(R) 



59' 



e 



(11) 



' In [21] it is shown that X]f=i fl^i = is the optimal constraint for this 
problem. This constraint has the disadvantage that the location of the phase 
reference is not well defined. Furthermore, the choice for the constraint used 
here simplifies our analysis in combination with the constraints required to 
uniquely identify the source locations and the apparent source powers. 



For the calibration problem defined by the first scenario, 
the Jacobian can be partitioned in four parts following the 
structure in 6: 



where it can be shown that 



GRo* ol + lo GRo* 



j((GRoG)oI-Io(GRoG))l, 

((GA)o(GA))l, 

vcc(I) 



(12) 

(13) 
(14) 
(15) 
(16) 



Is is a selection matrix of appropriate size equal to the identity 
matrix with its first column removed so that the derivatives 
with respect to 0i and al are omitted. 

In the second scenario the Jacobian can be partitioned in a 
similar way, the difference being that the expression for F^r^ 
given by Eq. (fT6] l should be replaced by 



lol 



(17) 



In the third and fourth scenarios an additional component 
Fi is added to the Jacobian containing the derivatives of 
vec(R) with respect to the source position coordinates. This 
component can be partitioned as Fi = [F;,F,„]. Introducing 

: diag([a:;i,a;2,---a;p]^)G (18) 
: diag([yi,2/2,---yp]^)G (19) 



Gj. 
G,, 



these components can be conveniently written as 

F, = -jy (GAoG,A-G;:AoGA)£Is (20) 
F™ = -j^(GAoG,yA-G;AoGA)SIs (21) 

These equations show that the entries of the Jacobian related 
to derivatives with respect to the I- and ?Ti-coordinates of 
the sources are proportional to the x- and j/-coordinates of 
the array elements respectively. The physical interpretation 
of this relation is that a plane wave propagating along the 
coordinate axis of the coordinate to be estimated provides a 
more useful test signal to estimate the source location than a 
signal propagating perpendicular to this axis. 

IV. Algorithms 

A. Generalized least squares formulation 

An asymptotically efficient estimate of the model param- 
eters 9 can be obtained via the ML formulation. Since 
all signals are assumed to be i.i.d. Gaussian signals, the 
derivation is standard and ML parameter estimates for N 
independent samples are obtained by minimizing the negative 
log-likelihood function [23] 



e = argmin (lii|R(0)| +tr (R-i(0)R 
9 



(22) 



where R(0) is the model covariance matrix as function of 9 
and R is the sample covariance matrix A^^^XX^. 

It does not seem possible to solve this minimization problem 
in closed form. As discussed in [23] a weighted least squares 
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covariance matching approach is known to lead to estimates 
that are, for a large number of samples, equivalent to ML 
estimates and are therefore asymptotically efficient and reach 
the CRB. 

The least squares covariance model fitting problem can be 
defined as 



e 



argmm 

e 



R - R(0) 



(23) 



Equivalently, we consider minimization of the cost function 



K{e)= R-R(0) =i"{e)i{9) 



(24) 



where t{9) = vec The more general weighted 

least squares problem is obtained by introducing a weighting 
matrix W and optimizing 



K^{9) = i"{e)wf{e). 



(25) 



The optimal weight is known to be the inverse of the asymp- 
totic covariance of the residuals, £ {{{Oo)i{Oo)^} , where Oq 
is the true value of the parameters [23]. The optimal weight 
for Gaussian sources is thus 



W 



opt 



(R® R) 



R 



R- 



(26) 



The Kronecker structure of Wopt allows to introduce Wc = 
R~ 2 and write the weighted least squares (WLS) cost function 
as 



/cwW- WjR-R(0) W, 



(27) 



As mentioned, this estimator is asymptotically unbiased and 
asymptotically efficient [23]. 

We propose to solve the least squares problems for the four 
cases defined in section|II]by alternating between least squares 
solutions to subsets of parameters. The subsets are chosen such 
that the solution is either available in the literature or can be 
derived analytically. We will have four subsets of parameters: 
the complex direction independent gains of the elements g, the 
apparent source powers a, the receiver noise powers cr„, and 
the source locations C. In the following subsections we will 
develop the unweighted and weighted least squares solutions 
for these subsets before putting them together to form the 
alternating least squares (ALS) or weighted alternating least 
squares (WALS) method respectively. 



B. Estimation of direction independent gains 

The least squares problem to find the omnidirectional gains 
g based on the weighted cost function kw (^) can be formu- 
lated as 



g 



argmm 
g 



(W,® WJ vcc R - S 



(Wc ® W,) diag (vec (Rq)) (g ® g) 



(28) 



where we have introduced Ro = ASA^; in this subsection, 
Rq is assumed to be known. This problem can be solved 



using standard techniques by regarding g and g as indepen- 
dent vector parameters and alternatingly solve for them until 
convergencqj In this approach, the solution for g is 



g = argmm 
s 



(Wc ® Wc) vec (r - S„ 



(Wc(55Wc) ((RoG^) ol)g 



R^G^R iGRo 0R^ 

H 



R iGRn o R 



vec ( R — S„ 



(29) 



Since the result for g is simply the complex conjugate of 
this relation, it is sufficient to apply Eq. ( [29] ) repeatedly 
until convergence. Although ALS ensures that the value of 
the cost function decreases in each iteration, it does not 
guarantee convergence to the global minimum, especially if 
the initial estimate is poor The number of iterations required 
for convergence also depends strongly on the vicinity of the 
initial estimate to the true value. We are thus interested in 
obtaining a good initial estimate for g. 

Fuhrmann [20] has proposed to use a suboptimal closed 
form solution to initialize the Newton iterations used to solve 
the ML cost function under the assumption that S„ is known 
(or actually no noise is present). A similar problem (with 
unknown S„ and Ro a rank 1 matrix, i.e. a single calibrator 
source) was also studied by us in [6], where a "column ratio 
method" (COLR) was proposed. Below, we generalize and 
improve that techniquqj. All these techniques are suboptimal 
in the case of multiple calibrator sources, but they can be used 
to provide the starting point for iterative refinement. 

The closed form solution by [20] is derived as follows. If 
the structure in v = g g is neglected and S„ is known or 
negligible, we can solve for v in the least squares sense by 



diag (vec (Rq)) ^ vec ^R - 



or equivalently 



gg« = R-S„ 0Ro. 



(30) 



(31) 



Note that the weights cancel since we solve for one parameter 
for each entry of R. Subsequently, use the structure of v and 
assume that the estimate of gg^ obtained in the first step has 
rank \. An eigenvalue decomposition is used to extract g as 
the dominant eigenvector from v. It is clear however that the 
noise on specific elements of v may be increased considerably 
if some entries of Ro have a small value. Also, the method is 
not applicable if S„ is unknown. 

In [7] another approach is therefore suggested based on the 
observation that gig^^ = Rik/Ro.ik holds for all off-diagonal 
elements of R, i.e., for i k. This implies that 



gi_ 

9j 



Rik/ Rq. 



ik 



RikRi 



Oj-fe 



9i9k 

9j9k Rjk/Ro,jk Ro,ikRjk 



(k^ij). (32) 



This relation is similar to the closure amplitude relation in 
astronomy [5], [24], which states that in an observation on a 

^We thank one of the reviewers for pointing this out. 
^^An initial presentation of this was in [7]. 
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single point source or, more generally, in the case of a rank 
1 model, the amplitude ratios are related as indicated by the 
second equality sign in ( l32b . 

Since the index k can be chosen freely as long as k ^ 
we can introduce Ci ^ being the column vector containing the 
values RikRojk and C2,ij being the column vector containing 
the values Ro.ikRjk for all possible values of k ^ We 
can now write ( |32] | in its more general form 



— C2,i] 

9j 



Cl, 



which has the well-known solution 



ft 
9j 



(33) 



(34) 



All possible gain ratios can be collected in a matrix M 
with entries A/y = cJ^^^Ci^ij. Since the model for M is 
Mij = gi/gj, the matrix M is expected to be of rank 1, 
and g can be extracted from this matrix using an eigenvalue 
decomposition: let Ui be the eigenvector corresponding to the 
largest eigenvalue of M, then g = aui, where the scaling a 
needs to be determined separately since the quotient gi/gj is 
insensitive to modification by a constant scaling factor applied 
to all gains. This scaling can be found by minimizing 



a = argmin |j aGRoG 



"a- HI 



(35) 



for all off-diagonal elements. By introducing the vectors Vq = 
vec_(GRoG^) and r = vcc_(R), where vcc_(-) operates 
like the vcc(-) operator but leaves out the elements on the main 
diagonal of this argument, this cost function can rewritten as 



(36) 



a = argmin |j \a\ rg 



which can be solved in the least squares sense by 3 = 

Equation (|34] | extends the column ratio method proposed in 
[6] to all elements of the matrix. The column ratio method 
was introduced to reconstruct the main diagonal of a rank 
1 source model, which allows to estimate this rank 1 model 
using an eigenvalue decomposition without distortion of the 
results due to unknown sensor noise powers. In our case this 
allows us to neglect the unknown receiver noise powers. The 
CRB analysis in [25] shows that it does not matter whether 
one simultaneously estimates the receiver noise powers and 
the omnidirectional complex gains exploiting all data, i.e. 
including the autocorrelations, or ignore the autocorrelations 
and solve for the direction independent complex gains only. 

The Monte Carlo simulations presented in [7] suggest that 
the method outlined above provides a statistically efficient 
estimate of g; the pseudo-inverse ensures that the noise on 
the entries of M does not increase dramatically due to small 
values in either R or Rq. It also provides a modification of 
the initial estimate proposed by Fuhrmann [20] which finds 
a near optimal solution to the least squares gain estimation 
problem without further optimization using, e.g., the Newton 
algorithm. This will generally save computational effort. In the 
simulations in this paper, we will use Eq. ( |34] | to demonstrate 
that this method gives statistically efficient results without 
requiring additional iterations using Eq. (|29T l. Moreover, if 



three or more iterations are required to ensure convergence 
using Eq. (|29] |. it also requires less computational effort, as 
discussed in Sec. IIV-GI 

Although the pseudo-inverse in Eq. (|34] | ensures robustness 
against small entries of either R or Rq, the fact that the 
method relies on gain ratios may lead to poor performance if 
there are small entries in g, e.g. due to failing array elements. 
This risk can be mitigated by rewriting Eq. (|33] | to 



By defining 



Yz = 



Cl = 



Co = 



C2,ijgi — Cl,ijgj- 



r T T „T y 



(37) 



Cl il 



Cl,i2 



Cl, 



fv^ • • • Y"^ 

[ I 1 : I 2 1 7 p 

yi 

y2 



yp 

we can enumerate i and j and collect all relations in a single 
matrix equation 

C2g = Cig. (38) 

This suggests that g can be found by searching for the null 
space of the p{p — 1)^ x p matrix C2 — Ci. By substituting 
the details of Ci.y and C2,ij in C2 — Ci, it is easily seen that 
g lies indeed in the null space and that this null space is one 
dimensional. However, this is only true for noise free data. In 
the practical case of noisy data, there will not be a null space 
and g lies in the noise subspace. Furthermore, finding g will 
involve a singular value decomposition on a very large matrix 
which may be computationally prohibitive. 

An alternative approach follows from recognizing that g is 
obtained from the principal right eigenvector of 



C2C1 



J2k Ro,ikRjkRo,jkRik 
Sfc Ro.ikRjk Ro,ikRjk 
__9z9jJ2k \9kf \RoAkf \Ro.jk\'^ 



\9,\'j:k\9k\'\Ro,zkf\R 



OJk I 



(39) 



which is easily found by substitution of the definition of Ci ^ 
and ij and use of the Moore-Penrose left inverse. Equation 
(39[ shows that g is the principal right eigenvector of CjCi, 
which has eigenvalue 1 in the noise free case. Simulations 
with completely randomized Rq and g suggest that the other 
eigenvalues are considerably lower than 1. Since it is easily 
demonstrated that the noise on actual data will lower the main 
eigenvalue, this is an important result; the contrast between 
the largest and the second-largest eigenvalue determines the 
susceptibility of this method to noise on the data. 

Note that these methods are still insensitive to a constant 
scaling factor applied to all gains. This ambiguity is solved 
by Eq. ( l36l l. Also note that all methods presented above 
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enumerate over i,j ^ fc to avoid the diagonal entries of 
the array covariance matrix which are affected by the system 
noise. By imposing more stringent restrictions on the choice 
of i and j, this method can also handle cases in which some 
elements in the array experience correlator noise, i.e. cases in 
which S„ is not diagonal but still has sufficient zero entries 
to allow the above methods to extract the required information 
for every combination of gi and gj. 

C. Source power estimation 

Based on the unweighted cost function k{6), a is found by 



argmm 

(T 

argmin 
argmin 

(T 



Hr^H 



R- GASA^G 



(GA)E(GA 



vec (r - S„) - (GA o GA) a 
(R-S„) 



1^ 

2 
F 



= (GA o GA) ' vec [Yl - S,^ 
If weighting is applied, a follows from 



(40) 



argmm 
cr 



We [R - S,, 
WcGASA^G^Wc 



argmin 
cr 



W,® W, 



;) vec 



(W,GA) o (W,GA) o-„ 

((WcGA) o (W^GA))^ X 
(Wc ® Wc) vec f R - £„ 



R 

2 
F 



(41) 

Using standard relations for Khatri-Rao and Kronecker prod- 
ucts and substituting W,; = R^a we obtain 



A«G»R-iGA (A^G^R-^GA 



vecdiag ( A^G^R"^ ( R - S ) R^^GA ) (42) 



This result confirms the observation by Ottersten, Stoica and 
Roy [23] that although the derivation involves a square root 
of the array covariance matrix, the final result only depends 
on the array covariance matrix and its inverse. 

D. Estimating receiver noise powers 

If S„ = diag (<T„) and no weighting is applied to the cost 
function, then is found by solving 

2 



(T„ = argmm 



R- GAEA-^G^ - 



(43) 



This estimation problem is the same as the sensor noise 
estimation problem treated in [6], so we just state the result, 
which is found to be 



CT„ = diag (^R - GASA 
If S„ — cr^jl, a similar derivation gives 



1 



tr ( R - GASA^G^ 



(44) 



(45) 



This result is just the average of the sensor noise estimates 
obtained when they are estimated individually. 

If the weighted cost function kw(0) is used, we can follow 
a derivation similar to the one that led us to the estimate for 
a in the previous section. We first note that 



argmin 



Wc (R - GRqG^ ) Wc 



argmin 



F 
W, 



(Wc o Wc) cr, 

it 



Wc) vec ( R - GRoG 

2 



H 



(Wc o Wc) ' (Wc ® Wc) vec (r - GRqG 



(46) 



Applying the standard Khatri-Rao and Kronecker product 
relations and inserting Wc = R^2 we get 

-1 



CT„ = (^R ^ © R X 

vecdiag {bt^ (^R - GRoG 
A similar derivation for S„ = cr^ I gives 



H 



R 



|R" 



tr R"i R- GRoG 



H 



R 



(47) 



(48) 



The true value of the covariance matrix R is not known in 
practical situations. Therefore the measured covariance matrix 
R is generally taken as estimate of R. It can be shown that this 
conventional approach leads to a noticeable bias in the estimate 
of CT„ for a finite number of samples N . For simplicity of 
the argument we will prove this statement for the case where 
R = cr^I, such that GRqG^ = 0. Inserting this in Eq. (l48T l 
gives 



2 tr ( R^RR^ 



R 



R-i 11/ tr(R-i 



(49) 



Since R ^ = (t„ ^I, we can write R ^ = ct„ ^ (I + E) where 
(T^^E = R^^ — R~^ represents a specific realization of the 



noise on the elements of R ^ . 
we obtain 



E) 11/ tr [a. 



After substitution in Eq. (l49T l 

2(I + E)) 



altT ((I 

al (tr(I 
X (tr (I) + tr (E)) . 



E) (I + E)^ j tr (I + E) 
- tr (E) + tr (E^) + tr (EE^))"^ 



X 

(50) 



Since E represents the noise on the data, the expected value 
of tr (E) and tr (E^) is zero. This implies that 



a,^(tr(I)+tr(EE«))-^r(I) 
II I 



2 
F 



I III 



E 



2 ■ 

F 



(51) 



This shows that the presence of noise systematically lowers the 
value of the estimate of cr^j. Since R^^ asymptotically in N 
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converges to R^^, E converges to zero. Therefore this result 
also shows that the cr,^j estimate asymptotically converges to 
the true value if the number of samples N approaches infinity. 

However, this bias can be avoided by using the best available 
knowledge of the estimated parameters, i.e., in ( l48T l use 

R = f A (£) SA^ (£) f ^ + S„. (52) 

In the first iteration of an alternating least squares algorithm, 
initial estimates of the parameters are used. These values 
are replaced by increasingly accurate estimated values in 
consecutive iterations. 

E. DOA estimation 

The problem of estimating C is that of estimating the direc- 
tion of arrival of signals impinging on a sensor array, and has 
been studied extensively. MUSIC [26] and weighted subspace 
fitting (WSF) [27], [28] are well-known statistically efficient 
DOA estimation methods applicable to arbitrary sensor arrays. 
In either case the eigenvalue decomposition of R is interpreted 
in terms of a noise subspace and a signal subspace, i.e. 
p 

R = ^ A,e,ef = E,A,,Ef + E„A„E^ (53) 

i=l 

where Ai > • • • > Ag > Ag+i > • • • > Ap. The noise 
eigenvalues of the whitened true covariance matrix are all 
equal and the number of sources can be derived from the 
distribution of the eigenvalues of R; we will assume that it 
is known. In practical situations in which only an estimate 
of the true covariance matrix is available, methods like the 
exponential fitting test [29], [30] or information theoretic 
criteria [31] must be used to determine the number of signals. 

It is quite straightforward to adapt the WSF method to our 
needs. The data model assumed in [28] can be described as 

R = A (£) SA^ (£) + all (54) 

where we have used A to avoid confusion with A introduced 
in this paper To map our data model on the data model 
described by Eq. (l54l i we should whiten our array covariance 
matrix using 

R^ = ^n^RSn^ = S;^GASA^G^S,T^ + I (55) 

and then set A = S„ ' GA. With these substitutions it is 
straightforward to implement the procedures described in [28]. 

F. Alternating Least Squares (ALS) and Weighted Alternating 
Least Squares (WALS) 

The ingredients of the previous four subsections can be 
combined to formulate a (Weighted) Alternating Least Squares 
solution to the stated optimization problems. We start by 
introducing an algorithm handling the first two scenarios 
identified in section DOA estimation is then added in a 
straightforward way. 

To estimate g, S and ct„ we propose the following (W)ALS 
algorithm. 

1) Initialization Set the iteration counter i = 1 and initialize 
ct'"' based on knowledge of eTg and directional response 



of the sensors. For WALS, define the weight Wc = 
R^2 . Initialize based on knowledge of the nominal 
position of the calibrator sources. 

2) Estimate g[*l by an eigenvalue decomposition of the 
matrix M with its entries obtained by averaging ( |32] | 
(resp. (l34b ) over all k ^ i,j or by an eigenvalue 
decomposition of the matrix CjCi as given by Eq. (l39T l 
using A and as prior knowledge. Note that neither 
gain calibration approach does require knowledge of the 
sensor noise powers (t„ and that the latter approach is 
advisable if some elements may have a very low gain. 

3) Estimate using either (|44]i or (|45]l (resp. (07]) or ( l48T l) 
applying available knowledge of g[*l, and A. 

4) Estimate o^'*' using (|40] | (resp. (l42l i) and knowledge of 
gW, and A. 

5) If the DOAs are inaccurately known (scenarios 3 and 4), 
estimate using WSF as described in subsection II V-EI 
using knowledge of gl'l and and initial estimate 

6) Check for convergence or stop criterion If I^I'^^ltgl'l ^ 
1| < 6 or i> imax, stop, otherwise increase i by 1 and 
continue with step 2. 

The proposed criterion for convergence is based on a measure 
of the average relative error in all parameters, and will work 
even if the parameter values differ by orders of magnitude. 

The extension with step |5] is referred to as the Extended 
ALS (xALS) resp. xWALS algorithm. 

An algorithm that alternatingly optimizes for distinct groups 
of parameters can be proven to converge if the value of 
the cost function decreases in each iteration. In [27], [28] it 
is demonstrated that WSF minimizes the least squares cost 
function w.rt. the parameterization of A, thus providing a 
partial solution to the least squares problem considered here. 
In step 2, g is estimated using a method that only provides 
a near optimal solution to the least squares cost function. 
Its solution could, however, not be discerned from the true 
solution in Monte Carlo simulations, as demonstrated later in 
this paper Optionally, step 2 could be augmented with one 
or two iterations of Eq. (|29] l to assure minimization of the 
least squares cost function. Alternatively, one could use the 
proposed estimate in the first iteration and use Eq. (l29T l in 
consecutive iterations in which a proper initial estimate is 
available from the previous iteration. The other parameters 
are estimated using well known standard solutions for least 
squares estimation problems. Therefore the value of the cost 
function is reduced in each step, thus ensuring convergence. 

G. Computational complexity 

Table |T] summarizes the numerical complexity of differ- 
ent stages of the ALS and WALS algorithms per iteration 
expressed in the number of complex multiplications. Nuer 
is the number of iterations required for convergence when 
using Eq. ( |29] l. In this table, the notation o(- • • ) is used to 
denote all the neglected lower order terms. The complexity 
of the omnidirectional gain estimates is dominated by the 
eigenvalue decomposition on M, which takes approximately 
Ap^ multiplications assuming use of the divide and conquer 
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TABLE I 

Complexity OF ALS and WALS iterations. 

method [32]. This step dominates the overall complexity 
as well, so it may be worthwhile to estimate the number 
of iterations of the power method [33] required to obtain 
sufficient accuracy, especially if p is large. 

The pseudo-inverse in Eq. (l40b can be implemented by a 
singular value decomposition which is computationally more 
efficient than direct computation of the Moore-Penrose inverse 
[33]. Some intermediate results, such as GASA^G^, are 
required more than once. In our calculation we assume that 
these terms are computed only once and stored for future use. 
Under this assumption the number of complex multiplications 
required to compute the noise power reduces to zero. The 
number of array elements p will generally be considerably 
larger than the number of source signals q. Therefore the q^-, 
pq^- and p^q-teims may be considered negligible compared to 
the p^-term in most cases. These results suggest that the better 
statistical performance of the WALS algorithm compared to 
the ALS algorithm as demonstrated by the simulation results 
presented in the next section comes with only a minor increase 
in complexity. 

Both algorithm can be augmented with source position 
estimates using WSF. In [28] it is demonstrated that WSF can 
be implemented with 0{pq^) complex operations per Gauss- 
Newton-type iteration of the proposed modified variable pro- 
jection algorithm once the result from the eigenvalue decom- 
position is available. The cost of WSF is therefore dominated 
by the eigenvalue decomposition on R requiring about Ap'^ 
complex multiplications. This makes the computational cost of 
source location estimation comparable to the cost of estimating 
the direction independent gains. 

V. Simulation results 

The methods proposed in the previous section were tested 
using Monte Carlo simulations and compared with the CRB. 
For these simulations a five-armed array was defined, each 
arm being an eight-element one wavelength spaced ULA. The 
first element of each arm formed an equally spaced circular 
array with half wavelength spacing between the elements. 
The source model used in the simulations is presented in 
Table This source model was created using random number 
generators to demonstrate that the proposed approach indeed 
works for arbitrary source models. 



TABLE 11 

Source powers and source locations used in the simulations 



# 


/ 


m 


-I 


1 


0.24651 


-0.71637 


1.00000 


2 


-0.34346 


0.76883 


0.88051 


3 


-0.13125 


-0.31463 


0.79079 


4 


-0.29941 


-0.52339 


0.74654 


5 


0.39290 


0.58902 


0.69781 



o CRB 

4 ' - ALS 

+ WALS 

3.5 
3 




20 40 60 80 100 

parameter index 

Fig. 2. Variance of the parameter vector 6 = [l'^,4>2r'' i</'40i 
(Tj, - - ,(^^,cr'n]'^ estimates obtained in Monte Carlo simulations for the 
weighted and unweighted versions of the alternating least squares algorithm 
and compared these results with the CRB. 



In the first simulation the direction independent gains, the 
source powers and the receiver noise powers were estimated 
assuming an array of equal elements, i.e. the parameter vector 
was defined as = [7-^, 4>2, - ■ ■ , <t>4o, cf, • • • cr|, ct^]"^ corre- 
sponding with the first scenario mentioned in section |ll] Data 
was generated assuming cr^ = 10 and N = 10^. This value 
for cr,^j implies an instantaneous SNR on the strongest source 
of only 0.1 per array element, a typical situation for radio 
astronomers. Figure |2] shows the variance of the estimated 
parameters based on 1000 runs and compares these values 
with the CRB. As expected the WALS method appears to be 
asymptotically statistically efficient while the ALS approach 
does not attain the CRB and shows clear outliers. 

Figure [3] shows the variance found on a number of rep- 
resentative parameters as function of the number of samples 
and compares this with the corresponding CRBs. This plot 
confirms the conjecture raised in the previous paragraph that 
the WALS method is asymptotically statistically efficient. 
Figure |4] shows the bias found in the Monte Carlo simulations 
and compares this with the statistical error of Itr for a single 
realization based on the CRB. This result indicates that both 
methods are unbiased for all parameters. 

Figure [5] shows the difference of the parameter value at the 
end of each iteration and its final value obtained from one 
run of the Monte Carlo simulation for a representative case, 
i.e. similar results were found for other parameters and other 
runs. The standard deviation based on the CRB is plotted as 
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Fig. 4. Bias on the estimated parameters (0 = [t^,</!'2i'', 4' AO , 
(t| , • ■ ■ , (T^, cr^]^) for the weighted and unweighted version of the alternating 
least squares algorithm in the Monte Carlo simulations. The bias is compared 
with the standard deviation on the estimates derived from the CRB. 



reference to facilitate the interpretation of the vertical scale. 
The results indicate that only one or two iterations are needed 
to reach the CRB in this scenario. With regard to other 
scenarios, this result suggests exponential convergence, i.e. 
each iteration adds about one significant digit to the parameter 
estimate, and it indicates that the WALS method converges 
more rapidly than the ALS method. In these simulation the 
stop criterion was |6>['"il^0W - 1| < 10"^". 

In the second simulation the direction independent gains, the 
apparent source powers and source locations and the receiver 
noise powers were estimated assuming an array of ideal 
elements. This corresponds to the third scenario described in 
section The parameter vector to be estimated is thus = 
h'^,(t)2,- ■ ■ , (/>40, cri, ■• • ,cri,cr2,/2, • • • ,15,1112,- ■■ ,1715]'^. 
Data were generated assuming a^^ = 10 and N = 1000. This 
again implies an instantaneous SNR of only 0.1 per array 
element but in this case the SNR per receiver is only 3.2 
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Fig. 5. The difference between the value of 75 at the end of each iteration 
and its final value is plotted versus the iteration number for ALS and WALS 
obtained in one of the Monte Carlo simulation runs. The behavior shown in 
this plot is representative for the results obtained for other parameters and 
from other runs. The standard deviation based on the CRB is also shown to 
demonstrate that one but preferably two iterations are sufficient to obtain a 
sufficiently accurate result. 
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Carlo simulations for the xWALS method, compared with the CRB. 



for the strongest source after integration. Figure |6] shows the 
variance on the estimated parameters based on 1000 runs and 
compares these results with the CRB. In these simulations 
we only used the xWALS algorithm, since the previous 
simulations demonstrated that the ALS method gives inferior 
results compared to the WALS approach. These results show 
that the variance on the parameter estimates are close to the 
CRB. 

Figure |7] shows the bias on the estimated parameters found 
in these Monte Carlo simulations. These results indicate that 
the estimates are unbiased. 

The convergence of a representative parameter is shown in 
Fig. [8] Again one but preferably two iterations are sufficient to 
get accurate results. However, in this scenario the stop criterion 
|0['-ilt0[«] _ 2| < 10^1° was not reached and the algorithm 
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deviation on the estimated derived from the CRB. 
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Fig. 8. The difference between the value of 75 at the end of each iteration 
and its final value is plotted versus the iteration number for one of the Monte 
Carlo simulation runs. The behavior shown in this plot is representative for 
the results obtained for other parameters and from other mns. The standard 
deviation based on the CRB is also shown to demonstrate that one but 
preferably two iterations are sufficient to obtain a sufficiently accurate result. 



Stops because the maximum number of iterations, which was 
set to 15, was reached. This indicates that the algorithm tries 
to interpret the noise as real signal. In the first iteration first 
the omnidirectional gains, apparent source powers and noise 
power are estimated. These values are then used in the WSF to 
find the source locations. The source locations are then used 
to update the sky model, leading to an update of 7, a and 
an and the cycle is complete. In the first simulation, the SNR 
after integration defined as \/Naf /a'^ was a factor 10 better 
than in this simulation in which the SNR per array element 
per source after integration is only 2.2 to 3.2. 

VI. Summary and conclusions 

In this paper we have developed a weighted alternating least 
squares (WALS) algorithm to solve for direction independent 



complex gains, apparent source powers and receiver noise 
powers simultaneously in a snapshot observation by a sensor 
array. Our solution for finding the direction independent gains 
extends and improves the available methods in the literature in 
the case of multiple calibrator sources and provides robustness 
to small entries in the array covariance matrix and small 
gain values (due to e.g. failing elements). Although we have 
assumed independent sensor noise powers, our gain estimation 
method is easily applied to cases in which some pairs of 
elements experience correlated noise. We also found that 
unbiased estimation of the receiver noise powers requires 
weighting with the best available array covariance matrix 
model instead of weighting with the measured array covariance 
matrix, which is common practice in signal processing. 

The statistical performance was compared with an un- 
weighted alternating least squares (ALS) algorithm and the 
CRB and found to provide a statistically efficient estimate 
thereby outperforming the ALS method. The computational 
complexity of ALS and WALS scale with 4^p^ and about 
respectively assuming that the number of calibrator sources 
is much smaller than the number of array elements, i.e. 
the WALS method comes with only a minor increase in 
computational burden. Simulations indicate that only one or 
two iterations are sufficient to reach the CRB and that every 
iteration adds about one significant digit, the WALS method 
converging slightly faster than the ALS method. 

We extended these methods with source location estimation 
using weighted subspace fitting. Simulations indicate that 
this extension to the WALS algorithm provides a statisti- 
cally efficient simultaneous estimate of omnidirectional gains, 
apparent source powers, source locations and receiver noise 
powers. Again, one to two iterations proved to be sufficient 
to reach the CRB. However, the convergence of the parameter 
is blocked at some level lower than the CRB by the noise in 
the measurement, at which point the algorithm keeps jumping 
from one local optimum to another. 
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